Abstract. This paper is devoted to the random generation of particular colored
necklaces for which the number of beads of a given color is constrained (these
necklaces are called $v$-balanced). We propose an efficient sampler (its
expected time complexity is linear) which satisfies the Boltzmann model
principle introduced by Duchon, Flajolet, Louchard and Schaeffer (7). Our main
motivation is to show that the absence of a decomposable specification can be
circumvented by mixing the Boltzmann samplers with other types of samplers.



Introduction 

Necklaces are classical objects in combinatorics (10; 6; 13; 14). They naturally
occur in the study of Lyndon words or in many other enumeration problems (5).
For $v = (v_1, \ldots, v_k)$ a $k$-tuple of positive integers, our interest lies
in uniformly drawing $v$-balanced necklaces of $n$ beads: the beads can take $k$
distinct colors and the number $n_i$ of beads of color $i$ verifies the
$v$-balance (which we define as meaning that $(n_1, \ldots, n_k)$ is collinear
to $v$). An additional reason to focus on $v$-balanced structures derives from
the intrinsic difficulty of enumerating and describing such objects in terms of
analytic combinatorics. In particular, the generating function of $v$-balanced
cycles is neither holonomic nor closed-form. We attempt to draw very large
necklaces in order to conjecture some limit properties on them. To that purpose,
we adopt the framework of Boltzmann samplers. This approach, like the recursive
method (15), allows one to "automatically" build a sampler from a decomposable
specification of a combinatorial class. A Boltzmann sampler does not guarantee
the size of the generated object, only that the drawn object has the same
probability of being drawn as any other object of the same size. More details on
this point can be found in the preliminary section. But we can already state
that, for a large number of combinatorial classes, this relaxation allows one to
generate an object of size $n$ in expected linear time $O(n)$ without
preprocessing.

The main problem is that $v$-balanced necklaces do not admit decomposable
specifications using traditional builders ($+$, $\times$, Seq, ...), except for
some very special cases. The aim of this paper is to adapt an ad-hoc sampler for
$v$-balanced sequences in such a way that the sampling follows a Boltzmann
model. After that, we will use this Boltzmann sampler (which cannot be built
from a decomposable specification) to obtain a Boltzmann sampler for
$v$-balanced cycles. To this end, we need to know how to obtain a Boltzmann
sampler for a class $A$ from a Boltzmann sampler for its pointing class
$\Theta A$.

This paper is organized in four sections. The first section defines the
notations and concepts used throughout the article. After recalling basic
notions related to combinatorial classes and Boltzmann samplers, we define our
classes of interest, namely the $v$-balanced sequences and the $v$-balanced
cycles.
The second section addresses the sampler for (1,1)-balanced cycles. We show two
ways of building such a sampler. The first one uses a natural isomorphism with a
decomposable combinatorial class involving Dyck paths. Unfortunately, this
approach cannot be generalized; therefore we propose a second one, based on an
isomorphism between pointed (1,1)-balanced cycles and the weighted sum of
(1,1)-balanced sequences. These sequences can be drawn with an ad-hoc sampler.
In the third section, a generalization of the previous approach to $v$-balanced
cycles is given, yielding an efficient Boltzmann sampler which, for our object
of study, bypasses the need for a specification of the class.
The fourth section concludes with some perspectives for future work.



1 Preliminaries 

An (unlabelled) combinatorial class is a couple $(C, s)$ (generally abbreviated
by $C$) where $C$ is a set of objects and $s$ is a function on $C$, called the
size function, which satisfies the conditions:

(i) $\forall a \in C$, $s(a)$ is a non-negative integer;

(ii) the number of objects of any given size is finite.

We naturally associate to a combinatorial class $C$ the ordinary generating
function $C(x) = \sum_n c_n x^n$, where $c_n$ is the number of objects of size
$n$ in $C$. A Boltzmann sampler for an unlabelled combinatorial class $C$ is a
random generator such that the probability to draw an object $a$ of size $n$ is
$P_x(a) = \frac{x^n}{C(x)}$. These samplers were first introduced in (7) and
extended (in particular with the sampler for unlabelled cycles) in (9). Let us
notice that a Boltzmann sampler depends on a parameter $x$ which can be tuned to
focus on an expected output size. More precisely, let $N$ be the random variable
of the size of the output; we can solve the equation
$E(N) = x \cdot \frac{C'(x)}{C(x)}$ in $x$ to center the output distribution on
an expected value.
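As a small illustration of this tuning step, the sketch below solves $E(N) = n$ by bisection. The toy class (words over two letters, $C(x) = 1/(1-2x)$), the helper names `expected_size` and `tune`, and the numerical derivative are illustrative assumptions, not part of the paper.

```python
def expected_size(C, x, h=1e-7):
    """E(N) = x * C'(x) / C(x), with C' estimated by a central difference."""
    dC = (C(x + h) - C(x - h)) / (2 * h)
    return x * dC / C(x)


def tune(C, target, lo, hi, iters=200):
    """Bisection on x in (lo, hi); assumes E(N) is increasing in x."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if expected_size(C, mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Toy class (assumption): words over two letters, C(x) = 1 / (1 - 2x).
# Here E(N) = 2x / (1 - 2x), so the exact solution is n / (2 (n + 1)).
def binary_words(x):
    return 1.0 / (1.0 - 2.0 * x)


x_star = tune(binary_words, target=100, lo=0.0, hi=0.4999)
```

For a target size of 100 the exact value is $100/202$, which the bisection should recover up to numerical error.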

Let us recall in the following table the samplers for the unlabelled operators
that we need.

| Sampler | Description |
|---|---|
| $\Gamma_x(Z)$ | Return $Z$, where $Z$ denotes an atomic class. |
| $\Gamma_x(A \times B)$ | Return $(\Gamma_x A, \Gamma_x B)$. |
| $\Gamma_x(A + B)$ | If Bernoulli$\left(\frac{A(x)}{A(x)+B(x)}\right) = 1$ then return $\Gamma_x A$, else return $\Gamma_x B$. |
| $\Gamma_x \mathrm{Seq}(C)$ | Draw $\ell$ according to a geometric law of parameter $C(x)$. Return the concatenation of $\ell$ calls to $\Gamma_x C$. |
| $\Gamma_x \mathrm{Rep}_n(C)$ | Return $aa\cdots a$ ($n$ times) with $a$ generated by $\Gamma_{x^n} C$. |
| $\Gamma_x \mathrm{Cyc}(C)$ | Let $K$ be a random variable in $\mathbb{N}^*$ verifying $P(K = k) = \frac{\varphi(k)}{k\,\mathrm{Cyc}C(x)} \log\frac{1}{1 - C(x^k)}$, with $\mathrm{Cyc}C(x)$ the generating function of $\mathrm{Cyc}(C)$. Draw $k$ according to the law of $K$. Draw $j$ according to a logarithmic law of parameter $C(x^k)$. Let $M$ be the concatenation of $j$ calls to $\Gamma_{x^k} C$. Return $MM\cdots M$ ($k$ times). |

Fig. 1. Some classical samplers, with $A$, $B$ and $C$ combinatorial classes and
$A(x)$, $B(x)$ and $C(x)$ their respective generating functions. $C$ must not
have neutral objects.

Boltzmann samplers are very powerful tools to efficiently generate
combinatorial structures (2; 8). In particular, a sampler can be built
automatically from the specification of a combinatorial class. Our aim is to
consider $\mathrm{Cyc}_v$ and $\mathrm{Seq}_v$, defined below, as additional
basic classes to be added to the collection of classical constructions, thus
increasing the expressivity of the Boltzmann model.
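The constructions of Fig. 1 can be sketched as higher-order functions. This is only a minimal illustration: the function names, the representation of objects as Python lists, and the example class $\mathrm{Seq}(Z_b + Z_w)$ are assumptions made for the example.

```python
import random


def geometric(p, rng):
    """P(l) = (1 - p) * p**l for l >= 0: count successes before first failure."""
    l = 0
    while rng.random() < p:
        l += 1
    return l


def atom(letter):
    # Gamma_x(Z): always returns the atom, whatever x is
    return lambda x, rng: [letter]


def union(gen_a, A, gen_b, B):
    # Gamma_x(A + B): choose A with probability A(x) / (A(x) + B(x))
    def gen(x, rng):
        if rng.random() < A(x) / (A(x) + B(x)):
            return gen_a(x, rng)
        return gen_b(x, rng)
    return gen


def seq(gen_c, C):
    # Gamma_x(Seq(C)): geometric number of independent calls to Gamma_x(C)
    def gen(x, rng):
        out = []
        for _ in range(geometric(C(x), rng)):
            out.extend(gen_c(x, rng))
        return out
    return gen


def rep(n, gen_c):
    # Gamma_x(Rep_n(C)): one call at parameter x**n, repeated n times
    return lambda x, rng: gen_c(x ** n, rng) * n


# Example (assumption): Seq(Z_b + Z_w), words over {b, w}; each atom has GF x.
letter = union(atom("b"), lambda x: x, atom("w"), lambda x: x)
words = seq(letter, lambda x: 2 * x)
rng = random.Random(0)
sample = words(0.45, rng)
triple = rep(3, words)(0.7, rng)
```

Passing the generating functions alongside the generators keeps each combinator local: a constructor only needs the values $A(x)$, $B(x)$ or $C(x)$ of its direct arguments.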

Definition 1. We denote by $\mathrm{Seq}_v$ (resp. $\mathrm{Cyc}_v$) the class
of sequences (resp. cycles) of atoms (objects of size 1) such that each atom can
be colored by one of the colors in $\{1, \ldots, k\}$ and the numbers $n_i$ of
beads of color $i$ verify the $v$-balanced condition, that is to say,
$(n_1, \ldots, n_k)$ is collinear to $v$.

From an easy observation, it can be seen that the generating function of
$\mathrm{Seq}_{(v_1, \ldots, v_k)}$ is
$\sum_{p \ge 0} \frac{(|v|p)!}{\prod_{i=1}^{k} (v_i p)!}\, x^{|v|p}$, where
$|v| = \sum_{i=1}^{k} v_i$. The following proposition is a trivial consequence.

Proposition 1 (see e.g. (1)). The class $\mathrm{Seq}_v$ is holonomic for every
$v$.

Proof. Indeed, a quick calculation shows that, for $k \ge 2$, this generating
function can be written as the hypergeometric function

$$F\left(\frac{1}{|v|}, \frac{2}{|v|}, \ldots, \frac{|v|-1}{|v|};\ \frac{1}{v_1}, \ldots, \frac{v_1-1}{v_1},\ \ldots,\ \frac{1}{v_k}, \ldots, \frac{v_k-1}{v_k},\ 1, \ldots, 1;\ \frac{|v|^{|v|}}{\prod_{i=1}^{k} v_i^{v_i}}\, x^{|v|}\right)$$

with $k-2$ lower parameters equal to 1. Hypergeometric functions are holonomic
by definition.

Remark: Nevertheless, the class $\mathrm{Seq}_v$ is algebraic in only very few
cases. For instance, $\mathrm{Seq}_{(1,1,1)}$ is not algebraic. See (1) for a
complete classification of the algebraic cases.
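The coefficient formula above is easy to check on small cases. The following sketch compares the multinomial coefficient with an exhaustive enumeration; the function names are hypothetical and the brute-force count is only practical for very small sizes.

```python
from itertools import product
from math import factorial


def seq_v_coeff(v, p):
    """Coefficient of x^(|v| p) in S_v: the multinomial (|v|p)! / prod (v_i p)!."""
    num = factorial(sum(v) * p)
    for vi in v:
        num //= factorial(vi * p)
    return num


def brute_force(v, p):
    """Count words of length |v| p over len(v) colors with n_i = v_i * p."""
    k = len(v)
    count = 0
    for word in product(range(k), repeat=sum(v) * p):
        if all(word.count(i) == v[i] * p for i in range(k)):
            count += 1
    return count


# For v = (1, 1) the coefficients are the central binomial coefficients.
central = [seq_v_coeff((1, 1), p) for p in range(5)]
```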



2 Generating (1,1)-balanced cycles



This section is dedicated to the random generation of (1,1)-balanced cycles.
This is a good example to illustrate our approach. A (1,1)-balanced sequence
(resp. cycle) is, by the previous definition, a sequence (resp. cycle) of black
atoms $Z_b$ and white atoms $Z_w$ such that the number of black atoms is equal
to the number of white atoms. The generating function of
$\mathrm{Seq}_{(1,1)}$ is
$S_{(1,1)}(x) = {}_1F_0(1/2;\ 4x^2) = (1 - 4x^2)^{-1/2}$. In particular,
$\mathrm{Seq}_{(1,1)}$ is algebraic. We are going to use this important property
in the following first approach.

2.1 First approach: through a decomposable specification

Both $\mathrm{Seq}_{(1,1)}$ and $\mathrm{Cyc}_{(1,1)}$ are specifiable. We can
thus apply the unlabelled samplers described in the previous table. More
precisely, $\mathrm{Seq}_{(1,1)}$ is a classical combinatorial notion, named
bridges. To generate $\mathrm{Cyc}_{(1,1)}$, we use an isomorphism between the
(1,1)-balanced cycles and the cycles of indecomposable Dyck paths.

Let $\mathcal{D}$ be the class of Dyck paths, of specification
$\mathcal{D} = \mathrm{Seq}(\nearrow \mathcal{D} \searrow)$. Dyck paths are
excursions from $(0, 0)$ to $(2n, 0)$ over the discrete lattice
$\mathbb{Z}_+ \times \mathbb{Z}_+$, with displacements of $(1, 1)$ and
$(1, -1)$. This can also be viewed as the class of well-formed parentheses
strings. Bridges are defined by
$\mathrm{Seq}((\nearrow \mathcal{D} \searrow) + (\searrow \overline{\mathcal{D}} \nearrow))$,
where $\overline{\mathcal{D}}$ is like $\mathcal{D}$, but with the roles of
$\searrow$ and $\nearrow$ interchanged. So, we can generate
$\mathrm{Seq}_{(1,1)}$ by classical Boltzmann sampler principles.

The class of indecomposable Dyck paths is the class of specification
$\nearrow \mathcal{D} \searrow$.

Proposition 2 (Raney's lemma). The balanced cycles can also be specified as
$\mathrm{Cyc}_{(1,1)} \simeq \mathrm{Cyc}(\nearrow \mathcal{D} \searrow)$, where
Cyc is the classical constructor for cycles.

Proof (sketch). We can represent an element of $\mathrm{Cyc}_{(1,1)}$ as a path
from $(0, 0)$ to $(2n, 0)$, with steps of $(1, 1)$ and $(1, -1)$, up to circular
permutations. Such a path has a non-empty set $S$ of points of minimal ordinate.
As we deal with paths up to circular permutations, we can consider that the path
begins at any $c$ in $S$ (see Fig. 2). In this case, we have Dyck paths, and we
remark that they are exactly the same up to circular permutations of their
indecomposable Dyck paths.



[Figure: Cycl(○ ● ● ● ○ ● ○ ○) ↔ Dyck path starting at a minimal point $c$ ↔ cycle of indecomposable Dyck paths.]

Fig. 2. Isomorphism between $\mathrm{Cyc}_{(1,1)}$ and $\mathrm{Cyc}(\nearrow \mathcal{D} \searrow)$.
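The rotation used in this proof sketch can be made concrete. The following sketch assumes an encoding that is not in the paper: white beads (up-steps) are written `(` and black beads (down-steps) are written `)`; the function names are hypothetical.

```python
def rotate_to_dyck(word):
    """Rotate a balanced word over '(' (up) and ')' (down) into a Dyck word.

    Starting just after the first point of minimal height keeps every
    prefix at height >= 0.
    """
    height, best, start = 0, 0, 0
    for i, step in enumerate(word):
        height += 1 if step == "(" else -1
        if height < best:
            best, start = height, i + 1
    return word[start:] + word[:start]


def indecomposable_factors(dyck):
    """Split a Dyck word at each of its returns to height 0."""
    factors, height, begin = [], 0, 0
    for i, step in enumerate(dyck):
        height += 1 if step == "(" else -1
        if height == 0:
            factors.append(dyck[begin:i + 1])
            begin = i + 1
    return factors


# The cycle of Fig. 2 under the assumed encoding.
dyck = rotate_to_dyck("()))()((")
factors = indecomposable_factors(dyck)
```

On this example the rotation yields `()((()))`, whose indecomposable factors are `()` and `((()))`; the cycle is then the cycle of these two factors.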



Now, with this specification, we can use the usual Boltzmann samplers for
unlabelled structures to draw (1,1)-balanced cycles. We obtain a Boltzmann
sampler for $\mathrm{Cyc}(\nearrow \mathcal{D} \searrow)$ by a combination of
the previously mentioned samplers, according to the specification. In the
following algorithms, the steps $\nearrow$ and $\searrow$ are written ○ and ●,
and $D(x)$ and $C_{(1,1)}(x)$ respectively represent the generating functions of
$\mathcal{D}$ and $\mathrm{Cyc}(○ \mathcal{D} ●)$.

Algorithm 1: $\Gamma_x \mathcal{D}$
Input: the parameter $x$
Output: an object of $\mathcal{D}$

Draw $\ell$ according to a geometric law of parameter $x^2 D(x)$
$M := \varepsilon$
for $i$ from 1 to $\ell$ do
    $M :=$ concat($M$, ○, $\Gamma_x \mathcal{D}$, ●)
return $M$

Algorithm 2: $\Gamma_x \mathrm{Cyc}_{(1,1)}$
Input: the parameter $x$
Output: a (1,1)-balanced cycle

Let $K$ be a random variable in $\mathbb{N}^*$ verifying
    $P(K = k) = \frac{\varphi(k)}{k\, C_{(1,1)}(x)} \log\frac{1}{1 - x^{2k} D(x^k)}$
Draw $k$ according to the law of $K$.
Draw $j$ according to a logarithmic law of parameter $x^{2k} D(x^k)$.
$M := \varepsilon$
for $i$ from 1 to $j$ do
    $M :=$ concat($M$, ○, $\Gamma_{x^k} \mathcal{D}$, ●)
return $[MM \cdots M]$ ($k$ times)

This method is an efficient way to draw (1,1)-balanced cycles. In particular,
the basic rejection sampler $\mu\mathrm{Cyc}_{(1,1)}(x; n, \varepsilon)$ (see
(7) for details) has an $O(n)$ overall cost on average. But it relies on a very
singular property of the class: it can be decomposed with usual constructors. We
are now interested in another way to generate these objects. This new approach
will be extended to all $v$-balanced objects in Section 3.
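Algorithms 1 and 2 can be sketched as follows. This is an illustration under stated assumptions: up and down steps are encoded as `(` and `)`, the law of $K$ is truncated at a hypothetical bound `kmax` (so the normalisation only approximates $C_{(1,1)}(x)$), and the recursive Dyck sampler is only suitable for $x$ not too close to $1/2$ because of Python's recursion limit.

```python
import math
import random


def D(x):
    """GF of Dyck paths by length, D = 1 / (1 - x^2 D), in a stable form."""
    return 2 / (1 + math.sqrt(1 - 4 * x * x))


def geometric(p, rng):
    l = 0
    while rng.random() < p:
        l += 1
    return l


def logarithmic(a, rng):
    """P(j) = a^j / (j * log(1 / (1 - a))) for j >= 1, by inverse CDF."""
    u = rng.random() * -math.log1p(-a)
    j, term = 1, a
    while u > term / j and term > 1e-300:
        u -= term / j
        j += 1
        term *= a
    return j


def dyck(x, rng):
    """Algorithm 1: a geometric number of arches '(' + Dyck path + ')'."""
    word = []
    for _ in range(geometric(x * x * D(x), rng)):
        word += ["("] + dyck(x, rng) + [")"]
    return word


def euler_phi(n):
    return sum(1 for i in range(1, n + 1) if math.gcd(i, n) == 1)


def cycle_11(x, rng, kmax=60):
    """Algorithm 2, with the law of K truncated at kmax (an approximation)."""
    ks = range(1, kmax + 1)
    weights = [euler_phi(k) / k * -math.log1p(-x ** (2 * k) * D(x ** k))
               for k in ks]
    k = rng.choices(ks, weights=weights)[0]
    j = logarithmic(x ** (2 * k) * D(x ** k), rng)
    motif = []
    for _ in range(j):
        motif += ["("] + dyck(x ** k, rng) + [")"]
    return motif * k  # one linear representative of the cycle
```

The returned list is one representative of the cycle; any rotation of it denotes the same object.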

2.2 Second approach: mixed samplers

We are still focused on the generation of (1,1)-balanced cycles, but now the use
of an algebraic specification is avoided. The idea of our sampler can be
summarized as follows: first, we adapt an ad-hoc sampler for the (1,1)-balanced
sequences in such a way that this sampling follows a Boltzmann model; second, we
show an isomorphism (see Theorem 1) between the class
$\Theta\mathrm{Cyc}_{(1,1)}$ of pointed balanced cycles and a sum involving
duplications of $\mathrm{Seq}_{(1,1)}$ (the notion of pointing classes is
recalled in this part); finally, to obtain a Boltzmann sampler for
$\mathrm{Cyc}_{(1,1)}$, we explain how to obtain a Boltzmann sampler for a class
$A$ from a Boltzmann sampler for its pointing class $\Theta A$.



Sampler for $\mathrm{Seq}_{(1,1)}$. Let us start by introducing our Boltzmann
sampler for $\mathrm{Seq}_{(1,1)}$ and proving its correctness:

Algorithm 3: $\Gamma_x \mathrm{Seq}_{(1,1)}$
Input: the parameter $x$
Output: a balanced sequence of $Z_b$ and $Z_w$.

Let $L$ be a random variable in $\mathbb{N}$ verifying $P(L = \ell) = \frac{(2\ell)!}{(\ell!)^2} \cdot \frac{x^{2\ell}}{S_{(1,1)}(x)}$
Draw $\ell$ according to the law of $L$.
Let $M$ be a $(2\ell)$-tuple; select uniformly $\ell$ positions among the $2\ell$ entries of $M$.
These positions are the $Z_b$ entries of $M$, the other ones are the $Z_w$ entries.
return $M$

Lemma 1. Algorithm 3 is a valid Boltzmann sampler for $\mathrm{Seq}_{(1,1)}$.

Proof. Let $a$ be an output of this algorithm. The probability to draw $a$ is
the probability to draw the right length and then to draw the right positions
for $Z_b$. So,

$$P(a) = \frac{|a|!}{\left((|a|/2)!\right)^2} \cdot \frac{x^{|a|}}{S_{(1,1)}(x)} \cdot \frac{\left((|a|/2)!\right)^2}{|a|!} = \frac{x^{|a|}}{S_{(1,1)}(x)}.$$
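Algorithm 3 can be sketched directly. In this illustration the half-length $\ell$ is drawn by sequential inversion of its distribution, using the ratio $\binom{2\ell+2}{\ell+1} / \binom{2\ell}{\ell} = 2(2\ell+1)/(\ell+1)$; the function names and the string encoding (`b` for $Z_b$, `w` for $Z_w$) are assumptions.

```python
import math
import random


def S11(x):
    """S_(1,1)(x) = (1 - 4x^2)^(-1/2), valid for 0 <= x < 1/2."""
    return 1 / math.sqrt(1 - 4 * x * x)


def draw_half_length(x, rng):
    """Inverse-CDF draw of l with P(L = l) = binom(2l, l) x^(2l) / S11(x)."""
    u = rng.random() * S11(x)
    l, term = 0, 1.0  # term = binom(2l, l) * x^(2l)
    while u > term and term > 1e-300:
        u -= term
        term *= 2 * (2 * l + 1) / (l + 1) * x * x  # ratio of consecutive terms
        l += 1
    return l


def seq_11(x, rng):
    """Algorithm 3: choose l black positions among 2l; the others are white."""
    l = draw_half_length(x, rng)
    black = set(rng.sample(range(2 * l), l))
    return "".join("b" if i in black else "w" for i in range(2 * l))
```

The loop in `draw_half_length` stops after $\ell$ iterations, so the cost of one call is linear in the size of the output.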

An isomorphism for $\Theta\mathrm{Cyc}_{(1,1)}$. Another classical operator in
structural combinatorics is the pointing operator, which can be defined as
follows:

Definition 2. Let $C$ be a combinatorial class. The combinatorial class
$\Theta C$ is formally defined as
$\sum_{n>0} C_n \times \{\varepsilon_1, \ldots, \varepsilon_n\}$, where the
$\varepsilon_i$ are distinct neutral objects (0-sized objects) and $C_n$ is the
sub-class of $C$ of the elements of size $n$.

The generating function of the pointed class $\Theta C$ is
$C^{\bullet}(x) = x \cdot \frac{dC}{dx}(x)$. We can interpret this operator by
saying that each object in $\Theta C$ is an object in $C$ with a tagged atom on
it.

Theorem 1. $\Theta\mathrm{Cyc}_{(1,1)}$ is isomorphic to
$\sum_{n>0} \varphi(n)\, \mathrm{Rep}_n(\mathrm{Seq}_{(1,1)})$, where $\varphi$
is Euler's totient function and $\mathrm{Rep}_n(A)$ is the class
$\{aa \cdots a \ (n \text{ times});\ a \in A\}$.

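Proof. Let $C_{bic}$ be the generating function for $\mathrm{Cyc}(Z_w + Z_b)$,
without any constraint on the number of beads ($Z_w$ and $Z_b$) of each color.
We can write the generating function $C_{bic}$ as follows:

$$C_{bic} = \sum_{n>0} \frac{\varphi(n)}{n} \log\frac{1}{1 - (Z_w^n + Z_b^n)} = \sum_{n>0} \frac{\varphi(n)}{n} \sum_{k>0} \frac{1}{k} (Z_w^n + Z_b^n)^k$$

$$C_{bic} = \sum_{n>0} \frac{\varphi(n)}{n} \sum_{k>0} \frac{1}{k} \sum_{p_1 + p_2 = k} \frac{k!}{p_1!\, p_2!}\, Z_w^{n p_1} Z_b^{n p_2}$$

The notion of diagonal for a bivariate generating function is needed for what
follows. Let $f(X, Y) = \sum_{i,j} a_{i,j} X^i Y^j$ be a bivariate generating
function; the function $g(Z) = \sum_n a_{n,n} Z^{2n}$ is called the
(1,1)-diagonal of $f(X, Y)$ and it is denoted by $\Delta f$.

So, by definition, $C_{(1,1)} = \Delta C_{bic}$ and we can use the previous
formula to obtain:

$$C_{(1,1)}(x) = \sum_{n>0} \frac{\varphi(n)}{n} \sum_{p>0} \frac{1}{2p} \cdot \frac{(2p)!}{(p!)^2}\, x^{2np}$$

Now, pointing the $\mathrm{Cyc}_{(1,1)}$ class yields:

$$C^{\bullet}_{(1,1)}(x) = \sum_{n>0} \varphi(n) \sum_{p>0} \frac{(2p)!}{(p!)^2}\, x^{2np} = \sum_{n>0} \varphi(n) \left(S_{(1,1)}(x^n) - 1\right)$$

Since the inner sum runs over $p > 0$, the term $-1$ removes the empty sequence:
the repeated sequences are understood to be non-empty.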

At this stage, it is possible to draw an object of size $n$ in
$\mathrm{Cyc}_{(1,1)}$ using the classical recursive method (12). But here we
adopt the Boltzmann point of view, which avoids costly preprocessing
computations.
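The counting identity behind this isomorphism can be checked by brute force on small sizes: the number of pointed balanced cycles of size $N$ is $N$ times the number of balanced cycles, and it should equal the coefficient of $x^N$ in $\sum_{n>0} \varphi(n) \sum_{p>0} \binom{2p}{p} x^{2np}$. The sketch below is only a sanity check with hypothetical function names.

```python
from itertools import product
from math import comb, gcd


def euler_phi(n):
    return sum(1 for i in range(1, n + 1) if gcd(i, n) == 1)


def balanced_necklaces(size):
    """Count balanced words over {b, w} of a given size, up to rotation."""
    seen = set()
    for word in product("bw", repeat=size):
        if word.count("b") * 2 == size:
            rotations = [word[i:] + word[:i] for i in range(size)]
            seen.add(min(rotations))  # canonical representative of the cycle
    return len(seen)


def pointed_coefficient(size):
    """Coefficient of x^size in sum_n phi(n) sum_{p>0} binom(2p, p) x^(2np)."""
    total = 0
    for n in range(1, size + 1):
        if size % n == 0 and (size // n) % 2 == 0:
            m = size // n  # m = 2p
            total += euler_phi(n) * comb(m, m // 2)
    return total


checks = [(size, size * balanced_necklaces(size), pointed_coefficient(size))
          for size in (2, 4, 6, 8, 10)]
```

For instance, size 4 gives the two cycles `bbww` and `bwbw`, hence 8 pointed objects, which matches $\varphi(1)\binom{4}{2} + \varphi(2)\binom{2}{1} = 6 + 2$.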

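This isomorphism allows us to describe a Boltzmann sampler for
$\Theta\mathrm{Cyc}_{(1,1)}$:

Algorithm 4: $\Gamma_x \Theta\mathrm{Cyc}_{(1,1)}$
Input: the parameter $x$
Output: a (1,1)-balanced cycle of $Z_b$ and $Z_w$.

$n := 1$
$S := \frac{\varphi(1)\,(S_{(1,1)}(x) - 1)}{C^{\bullet}_{(1,1)}(x)}$
Draw a real number $u$ uniformly in $[0, 1]$
while $u > S$ do
    $n := n + 1$
    $S := S + \frac{\varphi(n)\,(S_{(1,1)}(x^n) - 1)}{C^{\bullet}_{(1,1)}(x)}$
return $[\Gamma_x(\mathrm{Rep}_n(\mathrm{Seq}_{(1,1)}))]$

Here the term $-1$ excludes the empty sequence, so that the weights sum to 1;
the repeated sequence is drawn non-empty.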

Corollary 1. Algorithm 4 is a valid Boltzmann sampler for
$\Theta\mathrm{Cyc}_{(1,1)}$.

Proof. This is a corollary of the correctness of our general sampler, given in
Section 3 (Proposition 3).
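A possible rendering of Algorithm 4 is sketched below. It is an illustration under assumptions: the pointed generating function is approximated by truncating the sum over $n$ at a hypothetical bound `nmax`, the repeated sequence is drawn non-empty, and the first bead of the returned word is taken as the pointed atom.

```python
import math
import random


def euler_phi(n):
    return sum(1 for i in range(1, n + 1) if math.gcd(i, n) == 1)


def S11_minus_1(y):
    """GF of non-empty balanced sequences, (1 - 4y^2)^(-1/2) - 1."""
    # expm1/log1p avoid cancellation when y is tiny
    return math.expm1(-0.5 * math.log1p(-4 * y * y))


def nonempty_seq_11(y, rng):
    """Balanced word of size 2l, l >= 1, with P(l) ~ binom(2l, l) y^(2l)."""
    u = rng.random() * S11_minus_1(y)
    l, term = 1, 2 * y * y  # term = binom(2l, l) * y^(2l)
    while u > term and term > 1e-300:
        u -= term
        term *= 2 * (2 * l + 1) / (l + 1) * y * y
        l += 1
    black = set(rng.sample(range(2 * l), l))
    return "".join("b" if i in black else "w" for i in range(2 * l))


def pointed_cycle_11(x, rng, nmax=100):
    """Algorithm 4: draw n, then repeat n times a sequence drawn at x^n."""
    weights = [euler_phi(n) * S11_minus_1(x ** n) for n in range(1, nmax + 1)]
    total = sum(weights)  # truncated value of the pointed GF
    u, n, acc = rng.random(), 1, weights[0] / total
    while u > acc and n < nmax:
        n += 1
        acc += weights[n - 1] / total
    return nonempty_seq_11(x ** n, rng) * n  # first bead = pointed atom
```

Because $S_{(1,1)}(x^n) - 1$ decays like $2x^{2n}$, the truncation error is negligible for moderate `nmax` when $x < 1/2$.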

A Boltzmann sampler for $\mathrm{Cyc}_{(1,1)}$. We have obtained a Boltzmann
sampler for $\Theta\mathrm{Cyc}_{(1,1)}$. This is enough to uniformly generate
(1,1)-balanced cycles. But the sampler does not have a Boltzmann distribution
for $\mathrm{Cyc}_{(1,1)}$, so it cannot be called by another constructor. For
instance, $\mathrm{Cyc}(\Theta\mathrm{Cyc}_{(1,1)})$ is not equal to
$\mathrm{Cyc}(\mathrm{Cyc}_{(1,1)})$. Indeed, small objects are drawn with a
smaller probability in $\Theta\mathrm{Cyc}_{(1,1)}$. To unbias the sampler, we
are going to change its parameter according to a well-chosen density law
$f_x(u)$. A similar idea occurs in (16).



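Lemma 2. Let $C(x) = \sum_{n>0} c_n x^n$ be a generating function (with
$C(0) = 0$). For any fixed $x$ in the convergence disc of $C$, the function
$f_x(u) = \frac{C^{\bullet}(xu)}{u\, C(x)}$ is a density of probability on
$[0, 1]$.

Proof. Clearly $f_x$ is non-negative. Now, it remains only to prove that
$\int_{u=0}^{1} \frac{C^{\bullet}(xu)}{u\, C(x)}\, du = 1$. We can expand the
series $C^{\bullet}(xu)$:

$$\int_{u=0}^{1} \frac{C^{\bullet}(xu)}{u\, C(x)}\, du = \int_{u=0}^{1} \sum_{n>0} \frac{n\, c_n x^n u^{n-1}}{C(x)}\, du.$$

We now swap the sum and the integral, and we have:

$$\sum_{n>0} \frac{c_n x^n}{C(x)} \left[u^n\right]_{u=0}^{1} = 1.$$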



Theorem 2. The following sampler (Algorithm 5) gives a valid Boltzmann sampler
for $C$ with parameter $x$ from a Boltzmann sampler for $\Theta C$.

Algorithm 5: $\Gamma_x C$
Input: the parameter $x$
Output: an object in $C$.

Draw a real number $u$ according to the density law $\frac{C^{\bullet}(xu)}{u\,(C(x) - C(0))}$ on $[0, 1]$
if Bernoulli$\left(\frac{C(0)}{C(x)}\right) = 1$ then
    return an object in $C_0$ drawn uniformly.
else
    return $\Gamma_{ux}(\Theta C)$ and forget the point.

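Proof. It is sufficient to evaluate the probability that the output be of size
$n$, objects of the same size being equiprobable. If $n = 0$, we have drawn an
object in $C_0$. This occurs with probability $\frac{C(0)}{C(x)}$. If $n > 0$,
the sampler $\Gamma_{ux}(\Theta C)$ returns an object of size $n$ with
probability $\frac{n\, c_n (xu)^n}{C^{\bullet}(xu)}$, so the probability is

$$\left(1 - \frac{C(0)}{C(x)}\right) \cdot \int_{u=0}^{1} \frac{n\, c_n x^n u^n}{u\,(C(x) - C(0))}\, du = \frac{c_n x^n}{C(x)}.$$

In every case, this is a Boltzmann probability.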
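The unpointing step of Algorithm 5 can be tried on a toy class where everything is explicit. The sketch below is an illustration under assumptions that are not in the paper: the class is the set of non-empty sequences of a single atom, $C(z) = z/(1-z)$ (so $C(0) = 0$ and an object is identified with its size), and $u$ is drawn by numerically inverting the cumulative distribution $C(xu)/C(x)$ of the density $f_x$.

```python
import random


def C(z):
    """Toy class (assumption): non-empty sequences of one atom."""
    return z / (1 - z)


def geometric(p, rng):
    g = 0
    while rng.random() < p:
        g += 1
    return g


def pointed_sampler(y, rng):
    """Boltzmann sampler for the pointed toy class, returning the size n.

    P(n) = n y^n / (y / (1 - y)^2), so n - 1 is a sum of two geometrics.
    """
    return 1 + geometric(y, rng) + geometric(y, rng)


def draw_u(x, rng, iters=60):
    """Draw u with density C_pointed(xu) / (u C(x)): invert C(xu) / C(x)."""
    target = rng.random() * C(x)
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if C(x * mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def unpointed_sampler(x, rng):
    """Algorithm 5 when C(0) = 0: sample the pointed class at u*x, drop the point."""
    return pointed_sampler(draw_u(x, rng) * x, rng)
```

For this toy class the Boltzmann distribution is $P(n) = x^{n-1}(1-x)$, so at $x = 0.5$ the empirical mean size should be close to 2 and about half of the outputs should have size 1.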

This sampler allows us to generate extremely large (1,1)-cycles (more than
1000000 beads). Indeed, this sampler is clearly linear in the size of the
output. Fig. 3(a) shows a (1,1)-cycle of size 100 (only 100 beads, for
legibility). Fig. 3 also shows that we can compose our sampler with the
classical builders ($\times$, $+$, Seq, Cyc, MSet, PSet, ...): Fig. 3(b) shows a
randomly generated necklace of (1,1)-cycles. We can see that the necklace does
not contain a lot of (1,1)-cycles. Moreover, only one of these (1,1)-cycles
contains a lot of beads.

Fig. 3. Examples of Boltzmann sampling: (a) a random (1,1)-cycle of size 100;
(b) a random necklace of (1,1)-cycles of size 200.

3 The general vectorial case

In this part, we extend the previous method to all cases, algebraic or not. Let
$v = (v_1, v_2, \ldots, v_m)$ and $|v| = \sum_{i=1}^{m} v_i$. Our goal here is
to generate $\mathrm{Cyc}_v$: cycles of $m$ colors such that the numbers of
occurrences of the atoms $Z_i$ verify the $v$-balance condition. We follow the
same principles as in Section 2.2, but the proofs are slightly more technical.



3.1 Sampler for $\mathrm{Seq}_v$

Let us recall $S_v$, the generating function of $v$-balanced sequences:

$$S_v(x) = \sum_{p \ge 0} \frac{(|v|p)!}{\prod_{i=1}^{m} (v_i p)!}\, x^{|v|p}$$

Algorithm 6: $\Gamma_x \mathrm{Seq}_v$
Input: the parameter $x$
Output: a $v$-balanced sequence of $Z_i$.

Let $L$ be a random variable in $\mathbb{N}$ verifying $P(L = \ell) = \frac{(|v|\ell)!}{\prod_{i=1}^{m} (v_i \ell)!} \cdot \frac{x^{|v|\ell}}{S_v(x)}$
Draw $\ell$ according to the law of $L$.
Let $M$ be a $(|v|\ell)$-tuple.
for $i$ from 1 to $m$ do
    Select uniformly $\ell v_i$ positions among the entries of $M$ not yet assigned.
    These positions are the $Z_i$ entries of $M$.
return $M$

Lemma 3. Algorithm 6 is a valid Boltzmann sampler for $\mathrm{Seq}_v$. Its
arithmetic complexity is linear in the size of its output object.



Proof. The proof can be easily transposed from the (1,1)-balanced one. The
complexity result is trivial.
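A sketch of Algorithm 6 follows. It makes simplifying assumptions that are not in the paper: the law of $L$ is tabulated up to a numerically negligible tail (which is convenient but does not reflect the linear cost of the lemma), the terms are computed through `math.lgamma`, and colors are encoded as integers $0, \ldots, m-1$. Shuffling the multiset of colors is used in place of the successive selection of positions; both yield a uniform arrangement.

```python
import math
import random


def seq_v_terms(v, x, tol=1e-16):
    """Terms (|v|l)! / prod (v_i l)! * x^(|v| l) for l = 0, 1, ...

    Assumes 0 < x < radius of convergence of S_v; stops when a term is
    negligible compared with the running sum.
    """
    size, terms, total, l = sum(v), [], 0.0, 0
    while True:
        log_term = (math.lgamma(size * l + 1)
                    - sum(math.lgamma(vi * l + 1) for vi in v)
                    + size * l * math.log(x))
        term = math.exp(log_term)
        terms.append(term)
        total += term
        if l > 0 and term < tol * total:
            return terms
        l += 1


def seq_v(v, x, rng):
    """Algorithm 6: draw l, then arrange l * v_i atoms of each color i."""
    terms = seq_v_terms(v, x)
    l = rng.choices(range(len(terms)), weights=terms)[0]
    word = [i for i, vi in enumerate(v) for _ in range(vi * l)]
    rng.shuffle(word)  # uniform arrangement of the multiset of colors
    return word
```

For $v = (1, 2)$ the radius of convergence is $(4/27)^{1/3} \approx 0.529$, so a parameter such as $x = 0.45$ is admissible.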

Now, as with the example of (1,1)-balanced cycles, we are going to use this
sampler for $v$-balanced sequences to generate $v$-balanced cycles.



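3.2 An isomorphism for $\Theta\mathrm{Cyc}_v$

Theorem 3. $\Theta\mathrm{Cyc}_v$ is isomorphic to
$\sum_{n>0} \varphi(n)\, \mathrm{Rep}_n(\mathrm{Seq}_v)$.

Proof. Let $C_{mcol}$ be the generating function of cycles of $m$ atoms
$Z_1, \ldots, Z_m$:

$$C_{mcol} = \sum_{n>0} \frac{\varphi(n)}{n} \log\frac{1}{1 - \sum_{i=1}^{m} Z_i^n} = \sum_{n>0} \frac{\varphi(n)}{n} \sum_{k>0} \frac{1}{k} \left(\sum_{i=1}^{m} Z_i^n\right)^k$$

$$C_{mcol} = \sum_{n>0} \frac{\varphi(n)}{n} \sum_{k>0} \frac{1}{k} \sum_{p_1 + \cdots + p_m = k} \frac{k!}{\prod_{i=1}^{m} p_i!} \prod_{i=1}^{m} Z_i^{n p_i}$$

Let $C_v$ be the generating function of $\mathrm{Cyc}_v$. This is the extraction
of the terms of $C_{mcol}$ whose exponents verify the $v$-balanced condition:

$$C_v(x) = \sum_{n>0} \frac{\varphi(n)}{n} \sum_{p>0} \frac{1}{|v|p} \cdot \frac{(|v|p)!}{\prod_{i=1}^{m} (v_i p)!}\, x^{|v|np}$$

We will now apply the same idea that we described for the (1,1)-balanced case to
$\Theta\mathrm{Cyc}_v$ (the generating function of which is $C_v^{\bullet}$):

$$C_v^{\bullet}(x) = \sum_{n>0} \varphi(n) \sum_{p>0} \frac{(|v|p)!}{\prod_{i=1}^{m} (v_i p)!}\, x^{|v|np} = \sum_{n>0} \varphi(n) \left(S_v(x^n) - 1\right)$$

Since the inner sum runs over $p > 0$, the term $-1$ removes the empty sequence:
the repeated sequences are understood to be non-empty.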

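This isomorphism can be used to obtain the following sampler:

Algorithm 7: $\Gamma_x \Theta\mathrm{Cyc}_v$
Input: the parameter $x$
Output: a $v$-balanced cycle of $Z_i$.

$n := 1$
$S := \frac{\varphi(1)\,(S_v(x) - 1)}{C_v^{\bullet}(x)}$
Draw a real number $u$ uniformly in $[0, 1]$
while $u > S$ do
    $n := n + 1$
    $S := S + \frac{\varphi(n)\,(S_v(x^n) - 1)}{C_v^{\bullet}(x)}$
return $[\Gamma_x(\mathrm{Rep}_n(\mathrm{Seq}_v))]$

Here the term $-1$ excludes the empty sequence, so that the weights sum to 1;
the repeated sequence is drawn non-empty.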



Proposition 3. Algorithm 7 is a valid Boltzmann sampler for
$\Theta\mathrm{Cyc}_v$. Its arithmetic complexity is linear in the size of its
output object.

Proof. Let us consider the generation of a pointed cycle $c$.
It can be written as $c = u^p$, where $u$ is a primitive sequence (i.e. without
replication) and $p$ is the primitive repetition order of $c$ ($|c| = p|u|$).
There are $s = |u|$ shifts of $u$ which produce equivalent cycles.

In Algorithm 7, the generating sequence is not necessarily primitive. So, $c$
can be drawn as any $\bar{u}^d$, with $d \mid p$ and $\bar{u}$ a shift of
$\frac{p}{d}$ repetitions of $u$.

So, the probability to draw $c$ is the sum over all $d \mid p$ of the
probability to draw $d$ as repetition order and to draw one of the $s$ shifts of
$\bar{u}$ as motif:

$$P(c) = \sum_{d \mid p} \frac{\varphi(d)\,(S_v(x^d) - 1)}{C_v^{\bullet}(x)} \cdot \frac{s\, x^{|c|}}{S_v(x^d) - 1} = \frac{s\, p\, x^{|c|}}{C_v^{\bullet}(x)} = \frac{|c|\, x^{|c|}}{C_v^{\bullet}(x)},$$

using $\sum_{d \mid p} \varphi(d) = p$. Hence
$P(c, i) = \frac{1}{|c|} P(c) = \frac{x^{|c|}}{C_v^{\bullet}(x)}$ with
$i \in \{1, \ldots, |c|\}$, the choice of $i$ corresponding to the pointed atom.
The complexity ensues from the results on Boltzmann sampling.

To obtain a $v$-balanced cycle from an object of $\Theta\mathrm{Cyc}_v$, we can
now apply the general Algorithm 5 to $\Theta\mathrm{Cyc}_v$. As proven
previously, this provides us with a Boltzmann sampler for $\mathrm{Cyc}_v$.

4 Conclusion

The random generation of constrained-colored structures is in general very
difficult. In a previous paper (3), we already investigated the generation of
the $k$-colored structures and size-colored structures. In this short paper, we
have presented a way to efficiently generate $v$-balanced cycles. It is possible
with our samplers to generate $v$-balanced cycles of sizes reaching up to one
million. Nevertheless, our methods cannot be directly generalized to other
balanced structures. For instance, we do not know how to generate (1,1)-balanced
general non-planar (unlabelled) trees, where general non-planar trees can be
specified as $T = Z \times \mathrm{MSet}(T)$. This problem is a work in progress
and should be solved by a method involving multivariate Boltzmann samplers.
Another perspective is the generation of semi-labelled structures. In these
structures each atom can take a color in $\{1, \ldots, k\}$ but, if we have an
atom of color $k > 1$, we also need to have at least one atom of color $k - 1$.
Semi-labelling is a new and interesting labelling which is, in a sense, between
unlabelling and labelling. But very little is as yet known about it.

Finally, we want to thank B. Salvy for his precious knowledge on hypergeometric
functions and Joachim Dehais, Jeremie Lumbroso and Yann Ponty for their careful
reading of a preliminary version of the manuscript.